The conformation space of a 20-residue antiparallel β-sheet peptide, sampled by molecular dynamics simulations, is mapped to a network. Conformations are nodes of the network, and the transitions between them are links. The conformation space network describes the significant free-energy minima and their dynamic connectivity without projections onto arbitrarily chosen reaction coordinates. As previously found for the Internet and the World-Wide Web as well as for social and biological networks, the conformation space network is scale-free and contains highly connected hubs such as the native state, which is the most populated free-energy basin. Furthermore, the native basin exhibits a hierarchical organization that is not found for a random heteropolymer lacking a predominant free-energy minimum. The network topology is used to identify conformations in the folding transition state ensemble, and provides a basis for understanding the heterogeneity of the transition state and denatured state ensembles as well as the existence of multiple pathways.

Keywords: complex networks, protein folding, energy landscape, transition state, denatured state ensemble, multiple pathways
Proteins are complex macromolecules with many degrees of freedom. To fulfill their function they have to fold to a unique three-dimensional structure (native state). Protein folding is a complex process governed by noncovalent interactions involving the entire molecule. Spontaneous folding in a time range of microseconds to seconds can be reconciled with the large number of conformers by using energy landscape analysis. The main difficulty of this analysis is that the free energy has to be projected on arbitrarily chosen reaction coordinates (or order parameters). In many cases this yields a simplified representation of the free-energy landscape in which important information on the non-native conformation ensemble and the folding transition state ensemble is hidden. Moreover, the possible transitions between free-energy minima cannot be displayed in such projections, which hinders the study of pathways and folding intermediates. The characterization of the free-energy minima and the connectivity among them, i.e., the possible transitions between minima, is still an unresolved problem for peptides and proteins.

In the last five years many complex systems, like the World-Wide Web, metabolic pathways, and protein structures, have been modeled as networks. Intriguingly, common topological properties have emerged from their organization. A description of the potential energy landscape without the use of any projection has been given in terms of networks for a Lennard-Jones cluster of atoms.

Here, we introduce complex network analysis to study the conformation space and folding of beta3s, a designed 20-residue sequence whose solution conformation has been investigated by NMR spectroscopy. The NMR data indicate that beta3s in aqueous solution forms a monomeric (up to more than 1 mM concentration) triple-stranded antiparallel β-sheet (Fig. 1, bottom), in equilibrium with the denatured state. We have previously shown that in implicit-solvent molecular dynamics simulations beta3s folds reversibly to the NMR solution conformation, irrespective of the starting conformation. We consider conformations sampled by molecular dynamics simulations and the transitions between them as the network nodes and links, respectively. The network analysis makes it possible to identify the topological properties that are common to both beta3s, which folds to a unique three-dimensional structure, and a random heteropolymer, which lacks a single preferential conformation like the native state although it has the same residue composition as beta3s. These properties include the presence of several free-energy minima and highly connected conformations (hubs). On the other hand, a hierarchical modularity in the proximity of the native state is peculiar to a folding sequence.



I. MODEL AND METHODS 

Molecular dynamics simulations. The simulations and part of the analysis of the trajectories were performed with the program CHARMM. Beta3s was modeled by explicitly considering all heavy atoms and the hydrogen atoms bound to nitrogen or oxygen atoms (PARAM19 force field). A mean-field approximation based on the solvent-accessible surface was used to describe the main effects of the aqueous solvent on the solute. The two parameters of the solvation model were optimized without using beta3s. The same force field and implicit solvent model have been used recently in molecular dynamics simulations of the early steps of ordered aggregation, and folding of structured peptides (α-helices and β-sheets) ranging in size from 15 to 31 residues, as well as small proteins of about 60 residues. Despite the absence of collisions with water molecules, in the simulations with implicit solvent the separation of time scales is comparable with that observed experimentally. Helices fold in about 1 ns, β-hairpins in about 10 ns and triple-stranded β-sheets in about 100 ns, while the experimental values are ~0.1 μs, ~1 μs and ~10 μs, respectively. Recently, four molecular dynamics simulations of beta3s were performed at 330 K for a total simulation time of 12.6 μs. There are 72 folding events and 73 unfolding events, and the average time required to go from the denatured state to the folded conformation is 83 ns. The 12.6 μs of simulation length is about two orders of magnitude longer than the average folding or unfolding time, which are similar because at 330 K the native and denatured states are almost equally populated. For the network analysis the first 0.65 μs of each of the four simulations were neglected, so that along the 10 μs of simulations there are a total of 5 × 10^5 snapshots because coordinates were saved every 20 ps. The sequence of the random heteropolymer is a randomly scrambled version of the beta3s sequence with the same residue composition. It was simulated for 2 μs and 10^5 snapshots were saved. The conditions for the molecular dynamics simulations, i.e., force field, solvation model, temperature, and time interval between saved snapshots, were the same for both peptides.

FIG. 1: Beta3s conformation space network. The size and color coding of the nodes reflect the statistical weight w̃ and average neighbor connectivity knn, respectively. White, cyan, and red nodes have knn < 30, 30 < knn < 70, and knn > 70, respectively. Representative conformations are shown by a pipe colored according to secondary structure: white stands for coil, red for α-helix, orange for bend, cyan for strand and the N-terminus is in blue. The variable radius of the pipe reflects structural variability within snapshots in a conformation. The yellow diamonds are folding TS conformations (TSE1, TSE2, see text for details) characterized by a connectivity/weight ratio k/2w > 0.3, a clustering coefficient C < 0.3, and 60 < knn < 80. This figure was made using visone (www.visone.de) and MOLMOL visualization tools.

Construction of the protein folding network. To define the nodes and links of the network the secondary structure was calculated for each snapshot (Cartesian coordinates of the atomic nuclei) saved along the molecular dynamics trajectory. A "conformation" is a single string of secondary structure, e.g., the most populated conformation for beta3s (FS in Fig. 1) is -EEEESSEEEEEESSEEEE- where "E", "S", and "-" stand for extended, turn, and unstructured, respectively. There are 8 possible "letters" in the secondary structure "alphabet". Since the N- and C-terminal residues are always assigned a "-", a 20-residue peptide can assume $8^{18} \approx 10^{16}$ conformations. Conformations are nodes of the network and the transitions between them are links. A weight w is assigned to each node to take into account the free energy of each conformation; it is equal to the number of snapshots with a given secondary structure string. The statistical weight w̃ of a node is equal to the weight normalized by the total number of snapshots in the simulations (5 × 10^5 and 10^5 for beta3s and the random heteropolymer, respectively). Considering all the conformations visited during a μs-scale simulation can lead to a computationally intractable network size. For this reason we used for the network analysis the 1287 conformations of beta3s with significant weight (w > 20 per conformation). Two nodes are connected by an undirected link (and called neighbors) if they either include a pair of snapshots that are visited within 20 ps or they are separated by one or more conformations with less than 20 snapshots each. For the 2 μs of the random heteropolymer a threshold of w > 4 was used, so that w̃ > 4 × 10^-5 as in the beta3s network. The choice of a threshold value is somewhat arbitrary but the network properties are robust for a large range of threshold values (see Supplementary material).
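
The construction described above can be illustrated with a minimal Python sketch. It assumes that the trajectory is already available as a time-ordered list of secondary structure strings (one per saved snapshot); the function name `build_network`, its arguments, and the inclusive treatment of the weight threshold are illustrative assumptions and not part of the original analysis code.

```python
from collections import Counter

def build_network(strings, min_weight=20):
    """Map a time-ordered list of secondary-structure strings to a network.

    Returns the statistical weight of each retained node and the set of
    undirected links. `min_weight` is the snapshot-count threshold; treating
    it as inclusive (>=) is an assumption of this sketch.
    """
    weights = Counter(strings)                      # w: snapshots per conformation
    kept = {s for s, w in weights.items() if w >= min_weight}

    links = set()
    previous = None                                 # last retained node visited
    for s in strings:
        if s not in kept:
            continue                                # low-weight conformations are bridged over
        if previous is not None and s != previous:
            links.add(frozenset((previous, s)))     # undirected link between neighbors
        previous = s

    total = len(strings)
    stat_weight = {s: weights[s] / total for s in kept}
    return stat_weight, links
```

Because low-weight conformations are simply skipped, two retained nodes separated only by such conformations end up linked, which mirrors the second linking rule stated in the text.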

The properties of the network are also robust with respect to the length of the simulation time and the definition of the nodes. The topological properties are independent of the simulation length if one considers more than 2 μs. The correlation between statistical weight and connectivity, as well as the power-law behavior of the connectivity distribution and the 1/k behavior of the clustering coefficient distribution (see below), are essentially identical after 2, 4, and 10 μs. As an example, the exponent of the power law is 2.0 for the beta3s networks based on 2, 4 and 10 μs of simulation time. Defining nodes by grouping snapshots according to root mean square deviations (RMSD) in coordinates of Cα and Cβ atoms yields the same overall properties, i.e., power-law distribution of the links (with similar γ value) and 1/k tail of the clustering distribution. Grouping snapshots according to secondary structure motifs does not require the use of an arbitrarily chosen RMSD cutoff, and is able to capture the fluctuations of partially structured conformations.

Evaluation of Pfold. The TS ensemble can be defined as the set of structures which have the same probability of folding (Pfold) or unfolding in trajectories started with varying initial conditions. For each putative TS conformation, the probability to fold before unfolding was calculated from 100 very short trajectories at 330 K started from ten snapshots within a node. The only difference between the ten runs started from a given snapshot was the seed for the random number generator used for the initial assignment of the atomic velocities. A trajectory was considered to lead to folding (unfolding) if it first visits structures with a fraction of native contacts Q > 22/26 (Q < 4/26).
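
The commitment criterion can be sketched in a few lines of Python. The sketch assumes that each short trajectory has already been reduced to a time series of the fraction of native contacts Q; the function names are hypothetical, and excluding trajectories that reach neither threshold is an assumption of this example, since the text does not say how such cases were handled.

```python
def committed_state(q_trace, q_fold=22 / 26, q_unfold=4 / 26):
    """Return 'folded' or 'unfolded' for the first threshold crossed, else None."""
    for q in q_trace:
        if q > q_fold:
            return "folded"
        if q < q_unfold:
            return "unfolded"
    return None  # trajectory ended before committing to either state


def estimate_pfold(q_traces):
    """Fraction of committed trajectories that fold before unfolding."""
    outcomes = [committed_state(trace) for trace in q_traces]
    committed = [o for o in outcomes if o is not None]
    if not committed:
        raise ValueError("no trajectory reached the folded or unfolded threshold")
    return committed.count("folded") / len(committed)
```

For a node on top of the folding barrier, `estimate_pfold` applied to its 100 short trajectories would be expected to return a value close to 0.5.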



II. RESULTS AND DISCUSSION 

To study the conformation space network of polypeptides we concentrate on the analysis of topology, i.e., on the study of the connectivity between different conformations, leaving the analysis of transition rates for a later study. We have investigated the network topologies of several peptides, but in this paper we focus on beta3s and its randomly scrambled version. Additional details can be found in the Supplementary material, where the network properties of another structured peptide and a glycine homopolymer are presented.

Conformation space network of a structured peptide. The conformation space network and relevant structures of beta3s are shown in Fig. 1. The group of nodes at the bottom of Fig. 1 (red nodes) represents the native state basin (FS). The native basin is connected to a wide region of nodes with significant native content (cyan circles in the middle of Fig. 1). Although many heterogeneous routes can be taken to reach the folded state (in agreement with lattice simulations), most of the folding events have common structural features that define two average folding pathways. The less frequented average pathway (see also the density of transitions in Fig. 1, bottom right) consists of conformations that have the N-terminal hairpin formed while the C-terminal strand is mostly unstructured with non-native hydrogen bonds at the turn (TSE1 in Fig. 1). The second and most frequented average pathway includes conformations with a well formed C-terminal hairpin while the N-terminal strand is disordered (TSE2 in Fig. 1), namely it can be out-of-register or mostly unstructured. It is interesting to note that the same two folding pathways were observed experimentally for a 24-residue peptide with the same folded state as beta3s. Furthermore, multiple folding pathways have recently been detected by kinetic analysis of a β-sandwich protein.

The denatured state ensemble is very heterogeneous and includes high enthalpy, high entropy conformations (e.g., the partially helical conformations, denoted HH in Fig. 1) but also low enthalpy, low entropy conformations (e.g., the curl-like trap, TR). The former are loosely linked clusters of conformations with similar secondary structure (see Tab. 1) which are characterized by an unfavorable effective energy (sum of peptide potential energy and solvation energy) and fluctuating unstructured residues (e.g., the terminus of the helix shown on the top left of Fig. 1). In contrast, low enthalpy, low entropy traps form tightly linked clusters with almost identical secondary and tertiary structure, favorable effective energy (similar to that of the native structure, see Tab. 1) and no fluctuating residues (e.g., Fig. 1, top right). Taken together, these results indicate that FS is entropically favored over low enthalpy conformations like TR, i.e., FS has more flexibility than TR. A possible explanation is that the C-terminal carboxyl group is involved in four hydrogen bonds in TR (with the backbone NH groups of residues 4-7), whereas both termini undergo rather large fluctuations in FS. In addition, a more favorable van der Waals energy in TR is consistent with a denser packing in TR than in FS.



TABLE 1: Energetic comparison of folded and denatured state. The free energy of conformation i is $F_i = -k_B T \log(\tilde{w}_i)$, where w̃_i is the probability along the trajectory to find the peptide in the conformation i.

Conformation                     ⟨E⟩ (a)    F (b)

Folded state (FS)
-EEEESSEEEEEESSEEEE-              -7.6       0.0
-EEE-STTEEEEESSEEEE-              -8.6       0.1
-EEEESSEEEEE-STTEEE-              -8.4       0.5
-EEE-STTEEEE-STTEEE-              -9.2       0.7

Helical conformations (HH)
— HHHHHHHHHHS                      0.9       3.1
-HHHHHHHHHHHHS                    -1.9       3.3
— HHHHHHHHHHTT                     0.7       3.5
— HHHHHHHHHH                       0.5       3.7
-HHHHHHHHHHHHTT                   -0.8       3.7
— TT— HHHHHHHHHHHHH-              -0.8       3.8

Curl-like trap (TR)
— SSGGG-EEE-STTTEE-               -7.8       3.4
— SSSS— EEE-STTTEE-               -7.0       3.5
— S-GGG-EEE-STTTEE-               -9.3       3.7
— SSGGG-EEE-SGGGEE-               -9.6       3.7
— SSTTT-EEE-STTTEE-               -8.4       3.7

(a) Average effective energy.
(b) Free energy relative to the most populated conformation.
All values are in kcal/mol. The conformational entropy of the peptide is equal to $(\langle E \rangle - F)/T$. Note that the curl-like traps are entropically penalized with respect to the native state.
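
As a worked illustration of the two relations used in Table 1, the following Python sketch converts populations into relative free energies and evaluates the entropic term T·S = ⟨E⟩ − F. It assumes a natural logarithm and the molar Boltzmann constant in kcal/(mol K); the function names are hypothetical. Since F is given relative to the most populated conformation, T·S is defined only up to a common additive constant, so only differences between conformations are meaningful.

```python
import math

K_B = 0.0019872  # Boltzmann constant per mole, kcal/(mol K)


def relative_free_energy(w_i, w_ref, temperature=330.0):
    """F_i - F_ref = -k_B T ln(w_i / w_ref), in kcal/mol (natural log assumed)."""
    return -K_B * temperature * math.log(w_i / w_ref)


def entropic_term(mean_energy, free_energy):
    """T*S = <E> - F, in kcal/mol (up to a constant when F is relative)."""
    return mean_energy - free_energy


# Values taken from Table 1: most populated FS string and first TR string.
ts_fs = entropic_term(-7.6, 0.0)   # native conformation
ts_tr = entropic_term(-7.8, 3.4)   # curl-like trap
```

With these numbers the trap has a similar effective energy to the native conformation but a T·S term that is lower by about 3.6 kcal/mol, which is the entropic penalty noted in the table footnote.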



Note that the network description of non-native conformations is more detailed than the one obtained by projecting the free-energy surface on progress variables (e.g., based on the fraction of native contacts). In such projections, for low values of the fraction of native contacts, structures as diverse as helices and the curl-like conformations mentioned above are not distinguished. Even the ensemble with half of the native contacts is heterogeneous and hard to classify. Using the RMSD (with respect to a given structure) or the radius of gyration as reaction coordinate is even less selective. Only when a clever combination of variables is used is it possible to obtain a more detailed description of the free-energy landscape. The network description of the conformation space gives a synthetic and systematic view of all the possible conformations accessed by the system and their transitions. By considering the statistical weight of the nodes a thermodynamic description of the system is obtained.
FIG. 2: Correlation between the statistical weight w̃ and the connectivity k for beta3s. The connectivity is proportional to log(w̃) with a correlation coefficient of 0.88 (solid line). The correlation and the fit are calculated over all nodes of the network, but in the figure logarithmic binning is applied to reduce noise.

The high correlation between the statistical weight of a 
node and its number of links (Fig. 2) shows that the most 
connected nodes are also low lying minima on the free- 
energy landscape. This indicates that the conformation 
space network describes the significant free energy min- 
ima and their dynamic connectivity, without projection, 
where highly populated nodes are minima of free-energy 
and the set of nodes densely connected to them make up 
the basins of such minima. 

Folding and network topology. The average neighbor connectivity knn of beta3s (Fig. 3a), i.e., the average number of links of the neighbors of a given node, is rather heterogeneous, highlighting the presence of different connection rules in different regions of the network. This is not the case for the random heteropolymer (Fig. 3b), whose basins are similar to each other in organization and statistical weight, as previously found for most homopolymers. Note that for beta3s the native state is well discriminated by knn (red nodes in Fig. 1 and top band in Fig. 3a).
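
The definition of knn translates directly into code. The sketch below assumes the network is stored as a dictionary that maps each node to the set of its neighbors; this data layout and the function name are illustrative choices.

```python
def average_neighbor_connectivity(adjacency):
    """Return knn for every node: the mean number of links of its neighbors.

    `adjacency` maps each node to the set of nodes it is linked to.
    Isolated nodes are omitted because knn is undefined for them.
    """
    degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
    return {
        node: sum(degree[m] for m in neighbors) / len(neighbors)
        for node, neighbors in adjacency.items()
        if neighbors
    }
```

In a star-shaped network, for example, the hub has a low knn (its neighbors have one link each) while every leaf has a knn equal to the degree of the hub.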

The connectivity distribution of conformation space networks shows a well pronounced power-law tail $P(k) \sim k^{-\gamma}$ with γ = 2.0 for both beta3s and the random heteropolymer (Fig. 4a) as well as another structured peptide and homoglycine, i.e., (Gly)20 (see Supplementary material). The power law is due to the presence of a few highly connected "hubs" while the majority of the nodes have a relatively small number of links. This behavior has been previously observed for several biological, social and technological networks, which are known in the literature as scale-free networks. In terms of free energy this means that only a few low lying minima are present but they act as "hubs" with a large number of routes to access them.
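
A power-law exponent of this kind can be estimated, for illustration, by logarithmic binning of the node degrees followed by a least-squares line in log-log space. The Python sketch below is one simple way to do this and is not necessarily the fitting procedure used for Fig. 4; the bin base, the geometric bin centers, and the function names are assumptions of the example.

```python
import math


def log_binned_distribution(degrees, base=2.0):
    """Return (bin center, probability density) pairs using bins [1,b), [b,b^2), ..."""
    n = len(degrees)
    points = []
    lo = 1.0
    while lo <= max(degrees):
        hi = lo * base
        count = sum(1 for k in degrees if lo <= k < hi)
        if count:
            # Normalize by bin width and sample size; use the geometric bin center.
            points.append((math.sqrt(lo * hi), count / ((hi - lo) * n)))
        lo = hi
    return points


def fit_power_law_exponent(points):
    """Least-squares slope of log P versus log k; returns gamma = -slope."""
    xs = [math.log(k) for k, _ in points]
    ys = [math.log(p) for _, p in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return -slope
```

In practice the fit would be restricted to the tail of the distribution, as stated for Fig. 4, by passing only the points above a chosen minimum connectivity.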

The average clustering coefficient C is a measure of the probability that any two neighbors of a node are connected. Beta3s and the heteropolymer have C values of 0.49 and 0.28, respectively. These values are one order of magnitude larger than those of random realizations of the two networks with the same number of nodes and links. The native basin of beta3s includes the nodes with the largest number of links of the network. These nodes give rise to the 1/k tail of the clustering distribution (Fig. 4b), i.e., an inherently hierarchical organization of the conformations in the native basin of beta3s. Such organization is not observed for the non-native region of beta3s and the random heteropolymer. Note that the power-law scaling of the connectivity distribution can be considered as a general property of free-energy landscapes of polypeptides, whereas a hierarchical organization of the nodes reflects a single pronounced free-energy basin of attraction (like the native state).
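
The clustering coefficient of a single node, as defined in the caption of Fig. 4 (twice the number of links among its neighbors divided by k(k − 1)), can be computed as in the following Python sketch. The adjacency dictionary layout and the convention of returning 0 for nodes with fewer than two neighbors are assumptions of the example.

```python
from itertools import combinations


def clustering_coefficient(adjacency, node):
    """C_i = 2 n_i / (k_i (k_i - 1)) for one node.

    `adjacency` maps each node to the set of its neighbors; n_i is the
    number of links that exist among the neighbors of `node`.
    """
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention chosen here: undefined cases count as 0
    n_links = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
    return 2.0 * n_links / (k * (k - 1))
```

Averaging this quantity over all nodes with the same number of links k gives the clustering distribution whose 1/k tail is discussed above.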
FIG. 3: Average neighbor connectivity plotted as a function of the statistical weight for the 1287 nodes of beta3s (A) and for the 2658 nodes of the random heteropolymer (B). knn of node i is the average number of links of the neighbors of node i. The yellow diamonds are folding transition state conformations (see also Fig. 1 and text) characterized by a connectivity/weight ratio k/2w > 0.3, a clustering coefficient C < 0.3, and 60 < knn < 80.



FIG. 4: Topological properties of conformation space networks. Red and blue data points are plotted for beta3s and a random heteropolymer, respectively. For a direct comparison, the connectivity k is normalized by the average connectivity ⟨k⟩ of each network. Logarithmic binning is applied to reduce noise. (A) The connectivity distribution P(k) is the probability that a node (conformation) has k links (neighbor conformations). The straight line corresponds to a power-law fit $y = x^{-\gamma}$ on the tail of the distribution with γ = 2.0. (B) The clustering coefficient C describes the cliques of a node. For node i it is defined as $C_i = \frac{2 n_i}{k_i (k_i - 1)}$, where k_i is the number of neighbors of node i and n_i is the total number of connections between them. Values of C are averaged over the nodes with k links. The straight line corresponds to a power-law fit $y = x^{-1}$.

Transition state ensemble. As mentioned above, folding is a complex process with many degrees of freedom involved and it is difficult (or even impossible) to define a single reaction coordinate to monitor folding events. Hence, it is very difficult to isolate transition state (TS) conformations from equilibrium sampling. The TS conformations are saddle points, i.e., local maxima with respect to the reaction coordinate for folding and local minima with respect to all other coordinates. For this reason
we identified the nodes with a high connectivity/weight ratio k_i/2w_i and low clustering coefficient value C_i as putative TS conformations. The former criterion guarantees that these nodes are accessed and exited, most of the time, by a different route, i.e., they can be directly reached from different conformations of the network. The low clustering coefficient value guarantees that the neighbors of these conformations are likely to be disconnected. These two conditions are necessary but not sufficient because they do not distinguish folding TS conformations from saddle points between unfolded conformations. Since the folding TS conformations are linked to nodes both in the native state (having a large number of links) and in the denatured state (small/intermediate number of links), we speculated that folding TS conformations should have values of the average neighbor connectivity knn within a certain range. For nodes with high connectivity/weight ratio and low clustering coefficient, a remarkable correlation of 0.89 was found between the average neighbor connectivity knn and Pfold (Fig. 5), which is the probability of a given conformation to fold before unfolding. A Pfold value close to 0.5 is expected for conformations on top of the folding TS barrier, and the correlation suggests that network properties can be used to predict folding TS conformations. These are shown in Fig. 1 and 3a with yellow diamonds. As discussed above, two main average folding pathways are observed. The less frequented one is characterized by a transition state ensemble of conformations with the first hairpin in a native form (residues 1-13) and a bend at the position of the second native turn (residues 14-15). The C-terminal residues form a straight structure with almost no contacts, either native or non-native. The second average pathway shows a transition state with the second native hairpin formed (residues 7-20) and a bend at the position of the first native turn (residues 5-6). Such a symmetrical behavior is presumably due to the simplicity and symmetry of the native conformation as well as the symmetry in the sequence (sequence identity of 67% between the two hairpins). The folding TS conformations of beta3s form a heterogeneous ensemble with Cα root mean square deviations within contributing structures between 3 and 6 Å. In contrast to previous molecular dynamics studies in which progress variables based on the fraction of native contacts were used to describe TS conformations, the network properties yield a description of the folding TS ensemble (Fig. 1) which does not depend on the choice of reaction coordinates. Interestingly, the folding TS conformations of beta3s have about one-half of the native contacts formed but this is not a sufficient criterion (Table S1 in Supplementary material). Moreover, there is no correlation between the fraction of native contacts and the probability of folding. As a control, Pfold values smaller than 0.15 were obtained for five nodes with an average fraction of native contacts similar to the folding TS conformations but low connectivity/weight ratio and/or high clustering coefficient.
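
The three topological criteria for folding TS conformations (k/2w > 0.3, C < 0.3, and 60 < knn < 80, as listed in the captions of Figs. 1 and 3) amount to a simple filter over the nodes. The Python sketch below assumes that the per-node quantities have already been computed and are stored as tuples (k, w, C, knn), with w the raw snapshot count; the node names and numbers in the usage example are invented for illustration and are not data from the simulations.

```python
def putative_ts_nodes(nodes, ratio_min=0.3, c_max=0.3, knn_range=(60.0, 80.0)):
    """Select nodes that satisfy the three topological criteria for folding TS.

    `nodes` maps a node name to (k, w, C, knn): number of links, number of
    snapshots, clustering coefficient, and average neighbor connectivity.
    """
    selected = []
    for name, (k, w, c, knn) in nodes.items():
        high_ratio = k / (2.0 * w) > ratio_min          # many links per visit
        low_clustering = c < c_max                       # neighbors mostly disconnected
        bridging = knn_range[0] < knn < knn_range[1]     # between native and denatured
        if high_ratio and low_clustering and bridging:
            selected.append(name)
    return selected


# Hypothetical nodes, for illustration only.
example = {
    "candidate": (30, 40, 0.2, 70.0),
    "native_hub": (300, 50000, 0.1, 90.0),
    "trap": (20, 25, 0.6, 65.0),
}
ts_like = putative_ts_nodes(example)
```

Nodes selected this way remain putative until their Pfold is evaluated by short trajectories, which is the validation step described in the text.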
FIG. 5: Correlation between Pfold and average neighbor connectivity knn. Three nodes used as a control (low connectivity/weight ratio and/or high clustering coefficient but similar fraction of native contacts) are shown with empty circles.



III. CONCLUSIONS 

Complex network theory was used to analyze the conformation space of a structured peptide and that of a random heteropolymer of the same residue composition. Four main results have emerged. First, as already observed for a variety of networks as diverse as the World-Wide Web and the protein interactions in a cell, the conformation space network of polypeptide chains is a scale-free network (power-law behavior of the degree distribution). Second, the native basin of the structured peptide shows a hierarchical organization of conformations. This organization is not observed for the random heteropolymer, which lacks a native state. Third, free-energy minima and their connectivity emerge from the network analysis without requiring projections onto arbitrarily chosen reaction coordinates. As a consequence it is found that the denatured state ensemble is very heterogeneous and includes high entropy, high enthalpy conformations as well as low entropy, low enthalpy traps. Fourth, the network properties were used to identify transition state conformations and two main average folding pathways. It was found that the average neighbor connectivity knn correlates with Pfold, the probability of folding. Pfold is computationally very expensive to evaluate. Hence, it will be important to generalize this result by analyzing other structured peptides, which is work in progress in our research group. In conclusion, the network analysis seems particularly useful to study the conformation space and folding of structured peptides, including the otherwise elusive transition state ensemble.
FIG. S1: Dependence of the beta3s network properties on the node-weight threshold. The threshold value used in the present work (w = 20) is shown as an empty circle while filled circles correspond to threshold values of, from left to right, 500, 200, 100, 50, 10, 5, 2 and 1. (A) Relation between the threshold value and the number of nodes. (B) Number of links as a function of the number of nodes. When the threshold is very large (i.e., small number of nodes) the network approaches a topology where all possible connections are present (solid line, $N_{\mathrm{links}} = N_{\mathrm{nodes}}(N_{\mathrm{nodes}} - 1)/2$). When the threshold is small (i.e., large number of nodes) the network approaches a topology with only one link per node (dashed line, $N_{\mathrm{links}} \sim N_{\mathrm{nodes}}$). (C) Average number of links per node ⟨k⟩ as a function of the number of nodes. (D) Average clustering coefficient C as a function of the number of nodes.
FIG. S2: Dependence of the beta3s connectivity distribution (A) and clustering coefficient distribution (B) on the node-weight threshold. This plot shows that the scale-free behavior and the 1/k tail of the clustering coefficient distribution are robust with respect to the choice of threshold values.
FIG. S3: Connectivity distribution (left) and clustering coefficient distribution (right) for beta3s (filled circles), another structured peptide, i.e., residues 101-111 of α-lactalbumin (empty diamonds, Demarest et al., (1999) Biochemistry, 38, 7380), and a 20-residue homo-glycine which is unstructured (filled diamonds).